Why Supertagging Is Hard

Author

  • François Toussenel
Abstract

Tree adjoining grammar parsers can use a supertagger as a preprocessor to help disambiguate the category of words and thus speed up the parsing phase dramatically. However, since errors in supertagging propagate to this phase, it is vital to keep the error rate of the supertagging phase reasonably low. With the very large tagsets that come from extracted grammars, this error rate can approach 20% using standard Hidden Markov Model techniques. To combat this problem, we can trade higher precision for increased ambiguity in the supertagger output. I propose a new approach to introducing ambiguity into the supertags, looking for a suitable trade-off. The method is based on a representation of the supertags as feature structures and consists of grouping the values, or a subset of the values, of certain features, generally those hardest to predict.
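The grouping idea in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: a supertag is modelled as a feature structure (here a plain dict), and the values of a hard-to-predict feature are collapsed into a coarser group, so that one predicted "ambiguous" supertag stands for several full supertags. All feature names and groupings below are invented for illustration.

```python
# Sketch of feature-value grouping for supertags (hypothetical features).
# A supertag is a feature structure; collapsing the values of a feature
# that is hard to predict reduces the tagset size, trading ambiguity in
# the output for a lower error rate.

def collapse_feature(supertag, feature, groups):
    """Replace the value of `feature` with the name of the group it belongs to."""
    tag = dict(supertag)  # leave the original feature structure untouched
    for group_name, values in groups.items():
        if tag.get(feature) in values:
            tag[feature] = group_name
            break
    return tag

# Example: suppose the modifier-direction feature is hard to predict,
# so its two values are merged into one group, leaving direction ambiguous.
groups = {"any_dir": {"left", "right"}}

full_tag = {"cat": "adjective", "mod_dir": "left", "args": 0}
coarse_tag = collapse_feature(full_tag, "mod_dir", groups)
print(coarse_tag)  # {'cat': 'adjective', 'mod_dir': 'any_dir', 'args': 0}
```

After grouping, distinct full supertags that differ only in the collapsed feature map to the same coarse tag, so the supertagger has fewer, easier classes to predict, and the parser resolves the remaining ambiguity.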


Similar articles

Stacking or Supertagging for Dependency Parsing - What's the Difference?

Supertagging was recently proposed to provide syntactic features for statistical dependency parsing, contrary to its traditional use as a disambiguation step. We conduct a broad range of controlled experiments to compare this specific application of supertagging with another method for providing syntactic features, namely stacking. We find that in this context supertagging is a form of stacking...


HPSG Supertagging: A Sequence Labeling View

Supertagging is a widely used speed-up technique for deep parsing. Supertagging has also been exploited in NLP tasks other than parsing to make use of the rich syntactic information given by the supertags. However, the performance of the supertagger is still a bottleneck for such applications. In this paper, we investigate the relationship between supertagging and parsing, not just to...


CCG Supertagging with a Recurrent Neural Network

Recent work on supertagging using a feedforward neural network achieved significant improvements for CCG supertagging and parsing (Lewis and Steedman, 2014). However, their architecture is limited to considering local contexts and does not naturally model sequences of arbitrary length. In this paper, we show how directly capturing sequence information using a recurrent neural network leads to f...


TOWARDS EFFICIENT STATISTICAL PARSING USING LEXICALIZED GRAMMATICAL INFORMATION

For a long time, the goal of wide-coverage natural language parsers had remained elusive. Much progress has been made recently, however, with the development of lexicalized statistical models of natural language parsing. Although lexicalized tree adjoining grammar (TAG) is a lexicalized grammatical formalism whose development predates these recent advances, its application in lexicalized statis...


A Simple Approach for HPSG Supertagging Using Dependency Information

In a supertagging task, sequence labeling models are commonly used, but their limited ability to model long-distance information presents a bottleneck to further improvements. In this paper, we modeled this long-distance information in a dependency formalism and integrated it into the process of HPSG supertagging. The experiments showed that the dependency information is very informative for...



Publication date: 2004